Verification Formula Simplification #5026

FauziAkram · 2024-01-31T22:59:36Z

Simplifying the verification formula, which was added in order to better detect zugzwang positions.

Passed STC:
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 60160 W: 15529 L: 15336 D: 29295
Ptnml(0-2): 241, 6702, 16005, 6887, 245
https://tests.stockfishchess.org/tests/view/65b78dffc865510db027530e

Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 159414 W: 39970 L: 39891 D: 79553
Ptnml(0-2): 95, 17369, 44694, 17460, 89
https://tests.stockfishchess.org/tests/view/65b80835c865510db0275b73

Benchmark with zugzwang-suite (see #1338 ), single core, max 45 secs per position:
Patch solves 31 out of 37 - (After the study that follows changed to 32 out of 37)
Master solves 32 out of 37

However, there is alot to be said about this epd suite after I studied it and our master and this patch interaction with it.

Below are the important notes that I was able to deduct.

Position #17: Both variations were able to solve it, but tbh master solved it much quicker than patch.
Position #18 : None of the two variations were able to solve it
Position #19 : None of the two variations were able to solve it
Position #21 : The epd solution here is wrong, as it states that Ke3 and Kd3 are both valid solutions, but in reality, only Kd3 leads to white to win, while Ke3 leads only to draw, and by the way both master and patch didn't see the win, master opted for Ke3 with eval of 0.00 and patch opted for Nc3 also with eval 0.00, but none of them was able to see that Kd3 wins (And epd should be adjusted to remove Ke3 as a valid solution)
Position #28 : Both variations saw the win, but both of them opted to waste two extra plys with a repetition, opting for Nc3+ Kb4 Nd5+ Ka4, and then going for the valid solution with f4, and to be honest I don't see anything wrong with it, as long as the win was foreseen.
Position #33 : Both variations similarly opted for Kd7, which similarly to position 28, also leads to a win, but with a few extra plys, and both of them get stuck for a long time when analyzing this position (@peregrineshahin so, since you are investigating this SF issue, you can also use this position as testing material)
Position #36: Both engines similarly opted for Kb2, with fail high evaluations, and tbh even though I didn't do a deep study, I have the feeling that also here the epd solutions are not correct, as they only give Kb3 and Rf5 as valid solutions, but based on a quick check, it seems that also Kb2 is a valid solution after all.

Anyway in conclusion, I didn't observe huge differences in zugzwang detection for the proposed patch, the only two negative points I observed were:

Position number 17 slower solution
Position number 33 the patch version seems to me that it got stuck for a longer time than master, but I'm not sure also here.

Files:
Analyses.log
Analyses.txt

bench: 1492957

pb00067 · 2024-02-02T20:44:28Z

Hm, I'm a bit reluctant to see this as a simplification.
Sure it's a simplification for computation, but from logical point of view it is not simplier IMO.
Both in master and patch the verification search is done with search depth = depth - R (search.cpp line 806)
So currently in master there's a simple and understandable rule:
-during verification search the nm-pruning gets disabled for the first 3/4 plies of the nominal search tree.
The patch turns this rule into something more complicated imo.
Anyhow i'm OK with this patch, important is that the zugzwang detection improves with higher search depth and this seem still to be guaranteed.

vondele · 2024-02-03T16:18:02Z

I'm also thinking along the lines of @pb00067 this is not really simplifying the logic here. NMP is using a depth - R as the base variable. So what remains is just a parameter change.

peregrineshahin · 2024-02-03T16:25:29Z

I have some concerns about the logic in master for depth - R getting negative rather in the patch the value is always theoretically positive, I wonder about the reason how practically this doesn't happen.. as I'm getting a dbg_hit_on of 0% for R > depth on that particular line with very high depths suggesting that somehow for such high R relative to depth we practically never do the verification probably because of

            if (thisThread->nmpMinPly || depth < 15)
                return nullValue;

It would be probably more robust to have something like the bellow to prevent it from becoming negative
thisThread->nmpMinPly = ss->ply + 3 * (std::max(0, depth - R)) / 4;
but I'm not very sure if my concern is practically valid.

FauziAkram · 2024-02-03T16:29:31Z

That's ok for me, if this PR get closed, I will try again the removal of the *3/4 parameter in the formula, as cj also suggested in Discord.

However, one important thing that I learned while diving into this, is that the zugzwang test suite that we have is not very useful if you just run it blindly because:

It contains some mistakes.
In other cases SF chooses the right solution, but without realizing that it's a winning line.
In other cases SF chooses a different move, going for a single repetition before going to the correct move, but in the analysis it shows that he is able to identify the winning strategy.

So running an engine through the test suite reporting only the number of final solutions found given by the GUI is not a good practice.
The best would be to observe SF behavior on each position for the two different patches that we are testing (master, and the new potential patch).

pb00067 · 2024-02-05T07:13:42Z

@peregrineshahin

I wonder about the reason how practically this doesn't happen.

because it's mathematically prof:
-verification is only done with depth >= 16
-the first summand in std::min(int(eval - beta) / 144, 6) + depth / 3 + 4 is maximal 6

so we have 6 + 16/3 + 4
which is 15 meaning that R can reach maximal value of 15 while depth is minimum 16
so depth - R is always >= 1

peregrineshahin · 2024-02-05T07:18:14Z

@peregrineshahin

I wonder about the reason how practically this doesn't happen.

because it's mathematically prof: -verification is only done with depth >= 16 -the first summand in std::min(int(eval - beta) / 144, 6) + depth / 3 + 4 is maximal 6

so we have 6 + 16/3 + 4 which is 15 meaning that R can reach maximal value of 15 while depth is minimum 16 so depth - R is always >= 1

Makes sense, probably the tune going from 14 to 15 to 16 helped you in proving this :P

pb00067 · 2024-02-05T07:29:54Z

@FauziAkram

to point 1:
the zugzwang test suite we use is AFAIK basically an expansion of the allgeuer-zugzwang testsuite which I found linked on chessprogramming wiki. An yes is contains some mistakes or at least some positions which imo aren't really based on zugzwang motif. We could build up a more correct and meaningful testsuite but it requires some effort.

to point 2:
generally speaking engines aim to find the best move and that's also enough.
we can't pretend the engine to predict the exact PV until the game is converted.
But I agree the suite, should only consist of positions where the zugzwang motiv is decisive to convert the game.

to point 3:
these positions have to be figured out one by one.
In a epd-position is it also possible to define multiple best continuations, so these epd-positions can probably be adapted to allow alternative moves.

pb00067 · 2024-02-07T15:46:12Z

@peregrineshahin

Makes sense, probably the tune going from 14 to 15 to 16 helped you in proving this :P
By the way, I hate it when tune sessions exploit this parameter by increasing it to help their patches passing.
Imo tuning sessions should leave this parameter as it is.

peregrineshahin · 2024-02-07T15:52:44Z

I do also think that two tunes going from 14 to 15 to 16 have helped passing those VLTC tunes

By the way, I hate it when tune sessions exploit this parameter by increasing it to help their patches passing.
Imo tuning sessions should leave this parameter as it is.

FauziAkram · 2024-02-07T15:54:52Z

which parameter are talking about, that changed with time from 14 to 15 to 16 ?

peregrineshahin · 2024-02-07T15:59:26Z

if (thisThread->nmpMinPly || depth < 16) return nullValue:

Which means do not do verification search for depth < 16 and return the null move search value immediatly.

pb00067 · 2024-02-07T16:01:37Z

which parameter are talking about, that changed with time from 14 to 15 to 16 ?

Yes, that one. No hard feelings, but I think this parameter should be left out in tuning sessions,
otherwise it will be shifted up and up with time.

bench: 1492957

fe7b4b8

bench: 1492957

FauziAkram changed the title ~~Simplification~~ Verification Formula Simplification Feb 1, 2024

vondele closed this Feb 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Verification Formula Simplification #5026

Verification Formula Simplification #5026

FauziAkram commented Jan 31, 2024 •

edited

Loading

pb00067 commented Feb 2, 2024 •

edited

Loading

vondele commented Feb 3, 2024

peregrineshahin commented Feb 3, 2024 •

edited

Loading

FauziAkram commented Feb 3, 2024 •

edited

Loading

pb00067 commented Feb 5, 2024

peregrineshahin commented Feb 5, 2024 •

edited

Loading

pb00067 commented Feb 5, 2024

pb00067 commented Feb 7, 2024

peregrineshahin commented Feb 7, 2024

FauziAkram commented Feb 7, 2024

peregrineshahin commented Feb 7, 2024

pb00067 commented Feb 7, 2024

Verification Formula Simplification #5026

Verification Formula Simplification #5026

Conversation

FauziAkram commented Jan 31, 2024 • edited Loading

pb00067 commented Feb 2, 2024 • edited Loading

vondele commented Feb 3, 2024

peregrineshahin commented Feb 3, 2024 • edited Loading

FauziAkram commented Feb 3, 2024 • edited Loading

pb00067 commented Feb 5, 2024

peregrineshahin commented Feb 5, 2024 • edited Loading

pb00067 commented Feb 5, 2024

pb00067 commented Feb 7, 2024

peregrineshahin commented Feb 7, 2024

FauziAkram commented Feb 7, 2024

peregrineshahin commented Feb 7, 2024

pb00067 commented Feb 7, 2024

FauziAkram commented Jan 31, 2024 •

edited

Loading

pb00067 commented Feb 2, 2024 •

edited

Loading

peregrineshahin commented Feb 3, 2024 •

edited

Loading

FauziAkram commented Feb 3, 2024 •

edited

Loading

peregrineshahin commented Feb 5, 2024 •

edited

Loading